19 research outputs found

    The dawn of the human-machine era: a forecast of new and emerging language technologies

    New language technologies are coming, thanks to the huge and competing private investment fuelling rapid progress; we can either understand and foresee their effects, or be taken by surprise and spend our time trying to catch up. This report sketches out some transformative new technologies that are likely to fundamentally change our use of language. Some of these may feel unrealistically futuristic or far-fetched, but a central purpose of this report - and the wider LITHME network - is to illustrate that these are mostly just the logical development and maturation of technologies currently in prototype. But will everyone benefit from all these shiny new gadgets? Throughout this report we emphasise a range of groups who will be disadvantaged and issues of inequality. Important issues of security and privacy will accompany new language technologies. A further caution is to re-emphasise the current limitations of AI. Looking ahead, we see many intriguing opportunities and new capabilities, but a range of other uncertainties and inequalities. New devices will enable new ways to talk, to translate, to remember, and to learn. But advances in technology will reproduce existing inequalities among those who cannot afford these devices, among the world's smaller languages, and especially for sign language. Debates over privacy and security will flare and crackle with every new immersive gadget. We will move together into this curious new world with a mix of excitement and apprehension - reacting, debating, sharing and disagreeing as we always do. Plug in, as the human-machine era dawns

    Topic-based Historical Information Selection for Personalized Sentiment Analysis

    In this paper, we present a selection approach designed for personalized sentiment analysis with the aim of extracting related information from a user's history. Analyzing a person's past is key to modeling individuality and understanding the current state of the person. We consider a user's expressions in the past as historical information, and target posts from social platforms, with Twitter texts chosen as an example. While implementing the personalized model PERSEUS, we observed information loss due to the lack of flexibility regarding the design of the input sequence. To compensate for this issue, we provide a procedure for information selection based on the similarities in the topics of a user's historical posts. Evaluation is conducted comparing different similarity measures, and improvements are seen with the proposed method

    Looking into the Past: Evaluating the Effect of Time Gaps in a Personalized Sentiment Model

    This paper concerns personalized sentiment analysis, which aims at improving the prediction of the sentiment expressed in a piece of text by considering individualities. Mostly, this is done by relating to a person’s past expressions (or opinions); however, the time gaps between the messages are not considered in existing work. We argue that the opinion at a specific time point is affected more by recent opinions that contain related content than by earlier or unrelated ones; thus, a sentiment model ought to include such information in the analysis. Using a recurrent neural network with an attention layer as a basic model, we introduce three cases to integrate time gaps in the model. Evaluated on Twitter data with frequent users, we have found that the performance is improved the most by including the time information in the Hawkes process, and it is also more effective to add the time information in the attention layer than at the input

    PERSEUS: A Personalization Framework for Sentiment Categorization with Recurrent Neural Network

    This paper introduces the personalization framework PERSEUS in order to investigate the impact of individuality in sentiment categorization by looking into the past. The existence of diversity between individuals and certain consistency in each individual is the cornerstone of the framework. We focus on relations between documents for user-sensitive predictions. An individual’s lexical choices act as indicators for individuality; thus, we use a concept-based system which utilizes neural networks to embed concepts and associated topics in text. Furthermore, a recurrent neural network is used to memorize the history of a user’s opinions, to discover user-topic dependence, and to detect implicit relations between users. PERSEUS also offers a solution for data sparsity. At the first stage, we show the benefit of inquiring a user-specified system. Experiments on a combined Twitter dataset show improvements in performance over generalized models. PERSEUS can be used in addition to such generalized systems to enhance the understanding of a user’s opinions

    Corpus of long-term instant messaging based dialogues between advanced learners of German as a foreign language and German native speakers: deL1L2IM

    The deL1L2IM corpus, created between May and August 2012 and last updated in August 2014, has been collected within the framework of a PhD project on the development of a learning method implying conversations with an artificial companion. This PhD work is presented as a qualitative investigation of instant messaging dialogues on a long-term basis (four months) between advanced learners of German and German native speakers, chatting about whatever topic they wish. The dataset is composed of 72 dialogues, each of them having a duration of 20 to 45 minutes. The whole corpus contains ca. 52,000 words and 4,800 messages and has a file size of 0.5 Mb. Nine pairs of participants – i.e. nine learners and four native speakers – were required, with 8 dialogues per pair. The interactions have undergone linguistic analysis whereby the annotation will be performed only on repair/correction sequences (incomplete learner error annotation). The goal of the project was to create an application for language modelling and to improve learner language applications, tutoring software and dialogue systems. The corpus is delivered in one written text file (in XML format, customized under TEI P5)

    A data-driven model of explanations for a chatbot that helps to practice conversation in a foreign language

    This article describes a model of other-initiated self-repair for a chatbot that helps to practice conversation in a foreign language. The model was developed using a corpus of instant messaging conversations between German native and non-native speakers. Conversation Analysis helped to create computational models from a small number of examples. The model has been validated in an AIML-based chatbot. Unlike typical retrieval-based dialogue systems, the explanations are generated at run-time from a linguistic database

    Dealing with Trouble: A Data-Driven Model of a Repair Type for a Conversational Agent

    Troubles in hearing, comprehension or speech production are common in human conversations, especially if participants of the conversation communicate in a foreign language that they have not yet fully mastered. Here I describe a data-driven model for simulation of dialogue sequences where the learner user does not understand the talk of a conversational agent in chat and asks for clarification

    Data-driven Repair Models for Text Chat with Language Learners

    This research analyses participants' orientation to linguistic identities in chat and introduces data-driven computational models for communicative Intelligent Computer-Assisted Language Learning (communicative ICALL). Based on non-pedagogical chat conversations between native speakers and non-native speakers, computational models of the following types are presented: exposed and embedded corrections, and explanations of unknown words following a learner's request. Conversation Analysis helped to obtain patterns from a corpus of dyadic chat conversations in a longitudinal setting, bringing together German native speakers and advanced learners of German as a foreign language. More specifically, this work presents a bottom-up, data-driven research design which takes “conversation” from its genuine personalised dyadic environment to a model of a conversational agent. It allows for an informal functional specification of such an agent to which a technical specification for two specific repair types is provided. Starting with the open research objective to create a machine that behaves like a language expert in an informal conversation, this research shows that various forms of orientation to linguistic identities are at participants' disposal in chat. In addition, it shows that dealing with computational complexity can be approached by a separation between local models of specific practices and a high-level regulatory mechanism to activate them. More specifically, this work shows that learners' repair initiations may be analysed as turn formats containing resources for signalling trouble and referencing the trouble source. Based on this finding, this work shows how computational models for recognition of the repair initiations and trouble source extraction can be formalised and implemented in a chatbot.
Further, this work makes clear which level of description of error corrections is required to satisfy computational needs, and how these descriptions may be transformed to patterns for various error correction formats and which technological requirements they imply. Finally, this research shows which factors in interaction influence the decision to correct and how the creation of a high-level decision model for error correction in a Conversation-for-Learning can be approached. In sum, this research enriches the landscape of various communication setups between language learners and communicative ICALL systems explicitly covering Conversations-for-Learning. It strengthens multidisciplinary connections by showing how the multidisciplinary research field of ICALL benefits from including Conversation Analysis into the research paradigm. It highlights the impact of the micro-analytic understanding of actions accomplished by utterances in talk within a specific speech exchange system on computational modelling on the example of chat with language learners

    Artificial Companion for Second Language Conversation
